\title{Inference}

{{navbar}}

\subsubsection{Inference}

We describe how to perform inference in probabilistic models.
For background, see the
\href{/tutorials/inference}{Inference tutorial}.

Suppose we have a model $p(\mathbf{x}, \mathbf{z}, \beta)$ of data $\mathbf{x}_{\text{train}}$ with latent variables $(\mathbf{z}, \beta)$.
Consider the posterior inference problem,
\begin{equation*}
q(\mathbf{z}, \beta)\approx p(\mathbf{z}, \beta\mid \mathbf{x}_{\text{train}}),
\end{equation*}
in which the task is to approximate the posterior
$p(\mathbf{z}, \beta\mid \mathbf{x}_{\text{train}})$
using a family of distributions, $q(\mathbf{z},\beta; \lambda)$,
indexed by parameters $\lambda$.

In Edward, let \texttt{z} and \texttt{beta} be latent variables in the model,
where we observe the random variable \texttt{x} with
data \texttt{x\_train}.
Let \texttt{qz} and \texttt{qbeta} be random variables defined to
approximate the posterior.
We write this problem as follows:

\begin{lstlisting}[language=Python]
inference = ed.Inference({z: qz, beta: qbeta}, {x: x_train})
\end{lstlisting}

\texttt{Inference} is an abstract class which takes two inputs.  The
first is a collection of latent random variables \texttt{beta} and
\texttt{z}, along with ``posterior variables'' \texttt{qbeta} and
\texttt{qz}, which are associated to their respective latent
variables.  The second is a collection of observed random variables
\texttt{x}, which is associated to the data \texttt{x\_train}.

Inference adjusts parameters of the distribution of \texttt{qbeta}
and \texttt{qz} to be close to the
posterior $p(\mathbf{z}, \beta\,|\,\mathbf{x}_{\text{train}})$.

Running inference is as simple as running one method.

\begin{lstlisting}[language=Python]
inference = ed.Inference({z: qz, beta: qbeta}, {x: x_train})
inference.run()
\end{lstlisting}

Inference also supports fine control of the training procedure.

\begin{lstlisting}[language=Python]
inference = ed.Inference({z: qz, beta: qbeta}, {x: x_train})
inference.initialize()

tf.global_variables_initializer().run()

for _ in range(inference.n_iter):
  info_dict = inference.update()
  inference.print_progress(info_dict)

inference.finalize()
\end{lstlisting}

\texttt{initialize()} builds the algorithm's update rules
(computational graph) for $\lambda$;
\texttt{tf.global\_variables\_initializer().run()} initializes $\lambda$
(TensorFlow variables in the graph);
\texttt{update()} runs the graph once to update
$\lambda$, which is called in a loop until convergence;
\texttt{finalize()} runs any computation as the algorithm
terminates.

The \texttt{run()} method is a simple wrapper for this procedure.

\subsubsection{Other Settings}

We highlight other settings during inference.

\textbf{Model parameters}.
Model parameters are parameters in a model that we will always compute
point estimates for and not be uncertain about.
They are defined with \texttt{tf.Variable}s, where the inference
problem is
\begin{equation*}
\hat{\theta} \leftarrow^{\text{optimize}}
p(\mathbf{x}_{\text{train}}; \theta)
\end{equation*}

\begin{lstlisting}[language=Python]
from edward.models import Normal

theta = tf.Variable(0.0)
x = Normal(loc=tf.ones(10) * theta, scale=1.0)

inference = ed.Inference({}, {x: x_train})
\end{lstlisting}

Only a subset of inference algorithms support estimation of model
parameters.
(Note also that this inference example does not have any latent
variables. It is only about estimating \texttt{theta} given that we
observe $\mathbf{x} = \mathbf{x}_{\text{train}}$. We can add them so
that inference is both posterior inference and parameter estimation.)

For example, model parameters are useful when applying neural networks
from high-level libraries such as Keras and TensorFlow Slim. See
the \href{/api/model-compositionality}{model compositionality} page
for more details.

\textbf{Conditional inference}.
In conditional inference, only a subset of the posterior is inferred
while the rest are fixed using other inferences. The inference
problem is
\begin{equation*}
q(\mathbf{z}\mid\beta)q(\beta)\approx
p(\mathbf{z}, \beta\mid\mathbf{x}_{\text{train}})
\end{equation*}
where parameters in $q(\mathbf{z}\mid\beta)$ are estimated and
$q(\beta)$ is fixed.
%
In Edward, we enable conditioning by binding random variables to other
random variables in \texttt{data}.
\begin{lstlisting}[language=Python]
inference = ed.Inference({z: qz}, {x: x_train, beta: qbeta})
\end{lstlisting}

In the \href{/api/inference-compositionality}{compositionality page},
we describe how to construct inference by composing
many conditional inference algorithms.

\textbf{Implicit prior samples}.
Latent variables can be defined in the model without any posterior
inference over them. They are implicitly marginalized out with a
single sample. The inference problem is
\begin{equation*}
q(\beta)\approx
p(\beta\mid\mathbf{x}_{\text{train}}, \mathbf{z}^*)
\end{equation*}
where $\mathbf{z}^*\sim p(\mathbf{z}\mid\beta)$ is a prior sample.

\begin{lstlisting}[language=Python]
inference = ed.Inference({beta: qbeta}, {x: x_train})
\end{lstlisting}

For example, implicit prior samples are useful for generative adversarial
networks. Their inference problem does not require any inference over
the latent variables; it uses samples from the prior.

\begin{center}\rule{3in}{0.4pt}\end{center}

\begin{itemize}
  \item @{ed.inferences.Inference}
  \item @{ed.inferences.VariationalInference}
    \begin{itemize}
    \item @{ed.inferences.KLqp}
      \begin{itemize}
      \item @{ed.inferences.ReparameterizationKLqp}
      \item @{ed.inferences.ReparameterizationKLKLqp}
      \item @{ed.inferences.ReparameterizationEntropyKLqp}
      \item @{ed.inferences.ScoreKLqp}
      \item @{ed.inferences.ScoreKLKLqp}
      \item @{ed.inferences.ScoreEntropyKLqp}
      \end{itemize}
    \item @{ed.inferences.KLpq}
    \item @{ed.inferences.GANInference}
      \begin{itemize}
      \item @{ed.inferences.BiGANInference}
      \item @{ed.inferences.ImplicitKLqp}
      \item @{ed.inferences.WGANInference}
      \end{itemize}
    \item @{ed.inferences.MAP}
      \begin{itemize}
      \item @{ed.inferences.Laplace}
      \end{itemize}
    \end{itemize}
  \item @{ed.inferences.MonteCarlo}
    \begin{itemize}
    \item @{ed.inferences.Gibbs}
    \item @{ed.inferences.MetropolisHastings}
    \item @{ed.inferences.HMC}
    \item @{ed.inferences.SGLD}
    \item @{ed.inferences.SGHMC}
    \end{itemize}
  \item @{ed.inferences.complete_conditional}
\end{itemize}
